Individual Poster Page

See copyright notice at the bottom of this page.


Banner Years

November 6, 2002 - F James

I think there is an important point being missed here. The original question you posed was: Given the set of all players who hit at least 25% better than league average for 3 years in a row, what was their relative performance in Year 4? That was what led to the 149, 149, 149, 142 sequence. You then assert that 142, not 149, represents that group's real ability, and that you can expect a typical player in the group to regress toward the mean by 7/49 = 14%. But what is really going on here? I strongly suspect that most of the apparent regression is caused not by most players falling by moderate amounts but by a few players falling by big amounts. Remember, a player had to be at least 25% better than average 3 years running to make the cut, but there is no such requirement for Year 4. It would only take a few players falling below the 25% threshold to entirely explain the 7-point drop in the average.

To illustrate, suppose we started with 5 players whose scores in each of the first 3 years were as follows:

135, 142, 149, 156, 163. The average is 149.

Now suppose the player who scored 135 before has a really bad year and falls to 100 while the other 4 stay the same. The new average is 142, implying a 7-point regression for the group. But in reality the whole drop came from one player, and it was possible only because you no longer required performance 25% better than average in Year 4.
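The arithmetic in this example can be checked with a short Python sketch. The variable names are illustrative and not part of the original discussion; the scores are the five hypothetical ones given above.

```python
# Hypothetical five-player group from the example above.
years_1_to_3 = [135, 142, 149, 156, 163]

# Year 4: only the weakest player changes, dropping from 135 to 100.
year_4 = [100, 142, 149, 156, 163]

def mean(scores):
    """Plain arithmetic mean of a list of scores."""
    return sum(scores) / len(scores)

before = mean(years_1_to_3)    # group average in each of Years 1-3
after = mean(year_4)           # group average in Year 4
unchanged = mean(year_4[1:])   # average of the four players who did not move

print(before, after, before - after, unchanged)
# prints: 149.0 142.0 7.0 152.5
```

The group average falls by 7 points even though four of the five players did not change at all.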

To see how much of the "regression" is real and how much is due to the relaxation of the selection criterion, segregate your database into 2 groups: those that met the +25% criterion in all years and those that did not.
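A minimal sketch of that split, assuming each record is a player label plus four yearly scores on a scale where 100 is league average. The rows below are made up for illustration (they reuse the five-player example) and are not Tango's data; `split_on_criterion` and `yearly_means` are hypothetical helper names.

```python
from statistics import mean

THRESHOLD = 125  # at least 25% better than a league average of 100

# Made-up (player, [year 1, year 2, year 3, year 4]) rows, for illustration only.
records = [
    ("A", [135, 135, 135, 100]),
    ("B", [142, 142, 142, 142]),
    ("C", [149, 149, 149, 149]),
    ("D", [156, 156, 156, 156]),
    ("E", [163, 163, 163, 163]),
]

def split_on_criterion(rows, threshold=THRESHOLD):
    """Separate players who met the threshold in every year from the rest."""
    met_all = [r for r in rows if all(s >= threshold for s in r[1])]
    missed = [r for r in rows if not all(s >= threshold for s in r[1])]
    return met_all, missed

def yearly_means(rows):
    """Mean score for each year across a group of players."""
    return [mean(year) for year in zip(*(scores for _, scores in rows))]

met_all, missed = split_on_criterion(records)
print("met criterion in all 4 years:", yearly_means(met_all))
print("missed it in some year:", yearly_means(missed))
```

Comparing the Year 3 and Year 4 means within the first group shows how much that group moved on its own, separately from the players who dropped below the threshold.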


Banner Years

November 7, 2002 - F James

Sorry, MGL, but I've got to shoot down your argument. Sure, any group of players will regress toward the mean. The question is, what mean? It is NOT the mean of all players (i.e., 100) but the true mean for the group itself. If you assemble a team of All-Stars, you most certainly would not expect their true mean to be the overall population mean. Just because you can't know precisely what that number is doesn't mean you should assume it is 100. Indeed, if all you have to go on is their 3-year performance (e.g., 149, 149, 149) then your best estimate of the true mean for the group is 149.


Banner Years

November 7, 2002 - F James

You are confusing random sampling and selective sampling. If I select a group of major league players at random, by throwing darts at a dartboard, then I would expect them to regress to the overall population mean. But if I deliberately set out to choose the best players in the game, I would expect them to maintain their advantage over the rest of the league from year to year, except for a slight aging effect. Yes, there will be a few ringers in any group of All-Stars, average players who got extremely lucky for one year. There may even be one or two who can do it 2 years in a row. But by requiring at least 125 performance for 3 consecutive years you have virtually guaranteed that only the very best will qualify for your sample. This is a highly selective sample; it will NOT regress to the overall population mean.


Banner Years

November 8, 2002 - F James

First of all, your Experiment #5 is fundamentally different from Tango's. His requirement to enter his sample was that each player had to outperform the population as a whole by at least 25% for 3 years in a row. Your requirement is only that each player outperform the population for one specific month. ANY player can have an outstanding month, even Rey Ordonez. Thus, ANY player has a non-zero chance of showing up in your sample. Rey Ordonez has NO chance of showing up in Tango's sample, even if you simulated his performance over a thousand years. So your sample is fairly representative of the population at large. Tango's is NOT.

Not only are the Rey Ordonezes of baseball excluded entirely from Tango's sample (but not from yours), but I would go further and say that even average players are effectively excluded from his sample. As I said in my last post, "Yes, there will be a few ringers in any group of All-Stars, average players who got extremely lucky for one year. There may even be one or two who can do it 2 years in a row. But by requiring at least 125 performance for 3 consecutive years you have virtually guaranteed that only the very best will qualify for your sample."
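One rough way to see how quickly a three-period filter screens out lucky players: if, purely as a simplifying assumption, a player's chance of clearing the bar in any one period is p and the periods are independent, then the chance of clearing it three times running is p cubed. The probabilities below are made-up illustrations, not estimates for any real player, and `chance_of_streak` is a hypothetical helper name.

```python
def chance_of_streak(p_single, periods=3):
    """Probability of clearing the bar in every one of `periods` periods,
    assuming independent periods with the same single-period chance."""
    return p_single ** periods

# Illustrative single-period chances only; not fitted to any data.
for p in (0.5, 0.1, 0.01):
    print(f"single period: {p:.2f}   three in a row: {chance_of_streak(p):.6f}")
```

Under these assumptions, a player with a modest chance of one qualifying period has a far smaller chance of stringing three together, which is the sense in which the 3-year sample is so much more selective than the one-month sample.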

Tango, you can help us out here. Of the 655 strings of players in your Study 3, what was their distribution of career performance scores prior to Year 1? That is, how many were under 100, how many were in the 100 to 109 range, 110 to 119, etc. And how about their career performance after Year 3?
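The tabulation being asked for could be sketched as below. The scores are placeholders for illustration only, not Tango's data, and `bin_label` is a hypothetical helper name; the bins follow the ones named in the question (under 100, 100 to 109, 110 to 119, and so on).

```python
from collections import Counter

def bin_label(score):
    """Map a career performance score to a 10-point bin, with one bin for under 100."""
    if score < 100:
        return "under 100"
    low = (int(score) // 10) * 10
    return f"{low} to {low + 9}"

# Placeholder career scores, for illustration only.
career_scores = [96, 104, 118, 125, 131, 131, 147]

counts = Counter(bin_label(s) for s in career_scores)
for label, n in counts.items():
    print(label, n)
```

Running the same tabulation once on pre-Year 1 career scores and once on post-Year 3 career scores would give the two distributions requested.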


Copyright notice

Comments on this page were made by person(s) with the same handle, in various comments areas, following Tangotiger © material, on Baseball Primer. All content on this page remains the sole copyright of the author of those comments.

If you are the author, and you wish to have these comments removed from this site, please send me an email (tangotiger@yahoo.com), along with (1) the URL of this page, and (2) a statement that you are in fact the author of all comments on this page, and I will promptly remove them.